Fast String Matching with Mismatches
Abstract
We describe and analyze three simple algorithms that are fast on average for solving the problem of string matching with a bounded number of mismatches. These are the naive algorithm, an algorithm based on the Boyer-Moore approach, and ad-hoc deterministic finite automata searching. We include simulation results that compare these algorithms to previous works.
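To make the problem statement concrete, here is a minimal sketch of a naive search with at most k mismatches. It is not taken from the paper: the function name `k_mismatch_naive` and the choice to return (position, mismatch count) pairs are assumptions made for illustration. The sketch abandons an alignment as soon as it sees mismatch number k + 1, which is the usual reason a naive scan does far less work on typical input than its O(nm) worst case suggests.

```python
def k_mismatch_naive(text, pattern, k):
    """Return (position, mismatches) for every alignment of `pattern`
    in `text` that has at most `k` mismatching characters.

    Illustrative sketch only; the name and return shape are assumptions.
    """
    n, m = len(text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            if text[i + j] != pattern[j]:
                mismatches += 1
                if mismatches > k:
                    # Too many mismatches: give up on this alignment early.
                    break
        else:
            # The inner loop finished without exceeding k mismatches.
            hits.append((i, mismatches))
    return hits


if __name__ == "__main__":
    print(k_mismatch_naive("abracadabra", "abra", 0))
    print(k_mismatch_naive("abracadabra", "abra", 3))
```

With k = 0 this reduces to exact matching. Larger k lets each alignment run longer before it is abandoned.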
Related resources
String Matching with Mismatches by Real-Valued FFT
String matching with mismatches is a basic concept of information retrieval with some kinds of approximation. This paper proposes an FFT-based algorithm for the problem of string matching with mismatches, which computes an estimate with accuracy. The algorithm consists of FFT computations for binary vectors which can be computed faster than the computation for vectors of complex numbers. Theref...
A Note on Randomized Algorithm for String Matching with Mismatches
Abstract. Atallah et al. [ACD01] introduced a randomized algorithm for string matching with mismatches, which utilized fast Fourier transformation (FFT) to compute convolution. It estimates the score vector of matches between text string and a pattern string, i.e. the vector obtained when the pattern is slid along the text, and the number of matches is counted for each position. In this paper, ...
A Fast Algorithm for Approximate String Matching on Gene Sequences
Approximate string matching is a fundamental and challenging problem in computer science, for which a fast algorithm is highly demanded in many applications including text processing and DNA sequence analysis. In this paper, we present a fast algorithm for approximate string matching, called FAAST. It aims at solving a popular variant of the approximate string matching problem, the k-mismatch p...
Fast and Practical Approximate String Matching
We present new algorithms for approximate string matching based in simple, but efficient, ideas. First, we present an algorithm for string matching with mismatches based in arithmetical operations that runs in linear worst case time for most practical cases. This is a new approach to string searching. Second, we present an algorithm for string matching with errors based on partitioning the patter...
FFT-based algorithms for the string matching with mismatches problem
The string matching with mismatches problem requires finding the Hamming distance between a pattern P of length m and every length m substring of text T with length n. Fischer and Paterson’s FFT-based algorithm solves the problem without error in O(σn logm), where σ is the size of the alphabet Σ [SIAM–AMS Proc. 7 (1973) 113–125]. However, this in the worst case reduces to O(nm logm). Atallah, C...
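The per-symbol idea behind the O(σn log m) bound can be sketched as follows: for each symbol, correlate the 0/1 indicator vector of the text with that of the pattern, sum the resulting match counts, and subtract from m to get the Hamming distance at every alignment. This is a simplified illustration under stated assumptions, not the algorithm as published. It uses a plain recursive radix-2 FFT over the whole text (giving O(σn log n) rather than the blocked O(σn log m)), it iterates only over symbols that occur in the pattern, and the names `fft`, `correlate` and `hamming_distances` are made up for this sketch.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.
    With invert=True the result is unscaled (caller divides by len(a))."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def correlate(x, y):
    """Return c[i] = sum_j x[i + j] * y[j] for i = 0 .. len(x) - len(y),
    for integer-valued sequences x and y."""
    n, m = len(x), len(y)
    size = 1
    while size < n + m:
        size *= 2
    fx = fft(list(x) + [0] * (size - n))
    # Reversing y turns the correlation into an ordinary convolution.
    fy = fft(list(reversed(y)) + [0] * (size - m))
    conv = fft([p * q for p, q in zip(fx, fy)], invert=True)
    return [round(conv[i + m - 1].real / size) for i in range(n - m + 1)]


def hamming_distances(text, pattern):
    """Hamming distance between `pattern` and every length-m window of `text`."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    matches = [0] * (n - m + 1)
    # Symbols absent from the pattern contribute no matches, so skip them.
    for symbol in set(pattern):
        t_ind = [1 if c == symbol else 0 for c in text]
        p_ind = [1 if c == symbol else 0 for c in pattern]
        for i, count in enumerate(correlate(t_ind, p_ind)):
            matches[i] += count
    return [m - c for c in matches]


if __name__ == "__main__":
    print(hamming_distances("abracadabra", "abra"))
```

Because one correlation is needed per distinct symbol, the cost grows with the alphabet size, which is the weakness the abstract above points to when σ approaches m.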
Journal:
- Inf. Comput.
Volume 108
Publication date: 1994